Reinforcement Learning for Automatic Test Case Prioritization and Selection in Continuous Integration
Testing in Continuous Integration (CI) involves test case prioritization,
selection, and execution at each cycle. Selecting the most promising test cases
to detect bugs is hard when the impact of committed code changes is uncertain
or when traceability links between code and tests are not available. This paper
introduces Retecs, a new method for automatically learning test case selection
and prioritization in CI, with the goal of minimizing the round-trip time
between code commits and developer feedback on failed test cases. The Retecs
method uses reinforcement learning to select and prioritize test cases
according to their duration, time of last execution, and failure history. In a
constantly changing environment, where new test cases are created and obsolete
test cases are deleted, the Retecs method learns to prioritize error-prone test
cases higher, guided by a reward function and by observing previous CI cycles.
By applying Retecs to data extracted from three industrial case studies, we
show for the first time that reinforcement learning enables fruitful automatic
adaptive test case selection and prioritization in CI and regression testing.
Comment: Spieker, H., Gotlieb, A., Marijan, D., & Mossige, M. (2017).
Reinforcement Learning for Automatic Test Case Prioritization and Selection
in Continuous Integration. In Proceedings of the 26th International Symposium
on Software Testing and Analysis (ISSTA'17) (pp. 12--22). ACM
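As a rough illustration of the idea behind Retecs, the sketch below scores each test case from the three features named in the abstract (duration, time since last execution, failure history) and adjusts the scoring weights with a reward after each CI cycle. This is a hypothetical, heavily simplified linear version for illustration only; the actual method uses reinforcement learning agents, and all names and numbers here are invented.

```python
# Hypothetical sketch of reward-driven test prioritization in the spirit of
# Retecs. A linear score over the three features stands in for the learned
# agent; weights are nudged by a reward observed after the CI cycle.

def priority(test, weights):
    # Shorter tests, stale tests, and recently failing tests rank higher.
    return (weights["duration"] * -test["duration"]
            + weights["staleness"] * test["cycles_since_run"]
            + weights["failures"] * sum(test["recent_failures"]))

def prioritize(tests, weights):
    return sorted(tests, key=lambda t: priority(t, weights), reverse=True)

def update_weights(weights, scheduled, lr=0.1):
    # Reward early placement of failing tests, mildly penalize early
    # placement of passing ones; earlier ranks get more credit/blame.
    for rank, test in enumerate(scheduled):
        reward = 1.0 if test["failed_now"] else -0.1
        credit = reward * (len(scheduled) - rank) / len(scheduled)
        weights["failures"] += lr * credit * sum(test["recent_failures"])
        weights["staleness"] += lr * credit * test["cycles_since_run"]
    return weights

tests = [
    {"name": "t1", "duration": 5.0, "cycles_since_run": 1,
     "recent_failures": [1, 1, 0], "failed_now": True},
    {"name": "t2", "duration": 1.0, "cycles_since_run": 4,
     "recent_failures": [0, 0, 0], "failed_now": False},
]
weights = {"duration": 0.1, "staleness": 0.5, "failures": 1.0}
order = prioritize(tests, weights)
print([t["name"] for t in order])  # failure-prone t1 is scheduled first
weights = update_weights(weights, order)
```

After the cycle, the weight on failure history grows because the failing test was (correctly) ranked first, so similar tests will be favored in the next cycle.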
Constraint-Guided Test Execution Scheduling: An Experience Report at ABB Robotics
Automated test execution scheduling is crucial in modern software development
environments, where components are frequently updated with changes that impact
their integration with hardware systems. Building test schedules that focus on
the right tests and make optimal use of the available resources, both time and
hardware, while satisfying numerous requirements on the selection of test
cases and their assignment to specific test execution machines, is a complex
optimization task. Manual solutions are time-consuming and often error-prone.
Furthermore, when software and hardware components and test scripts are
frequently added, removed, or updated, static test execution scheduling is no
longer feasible, and the motivation grows for automation that takes care of
dynamic changes. Since 2012, our work has focused on transferring technology
based on constraint programming for automating the testing of industrial
robotic systems at ABB Robotics. Having successfully transferred constraint
satisfaction models dedicated to test case generation, we present the results
of a project called DynTest, whose goal is to automate the scheduling of test
execution from a large test repository on distinct industrial robots. This
paper reports on our experience and lessons learned from successfully
transferring constraint-based optimization models for test execution
scheduling at ABB Robotics. Our experience underlines the benefits of a close
collaboration between industry and academia for both parties.
Comment: SafeComp 202
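To make the problem shape concrete, the following sketch assigns tests to machines subject to a hardware-compatibility constraint while greedily balancing total execution time. This is a hypothetical illustration, not the DynTest model: the project uses constraint-based optimization, whereas this is a simple heuristic, and all machine names and capabilities are invented.

```python
# Hypothetical sketch of constrained test execution scheduling: each test may
# only run on a machine providing its required capability, and long tests are
# placed first to balance load across machines (a common makespan heuristic).

def schedule(tests, machines):
    load = {m["name"]: 0.0 for m in machines}
    plan = {m["name"]: [] for m in machines}
    for test in sorted(tests, key=lambda t: -t["duration"]):
        # Constraint: only machines with the required capability qualify.
        feasible = [m for m in machines
                    if test["requires"] in m["capabilities"]]
        if not feasible:
            raise ValueError(f"no machine can run {test['name']}")
        # Greedy objective: put the test on the least-loaded feasible machine.
        target = min(feasible, key=lambda m: load[m["name"]])
        plan[target["name"]].append(test["name"])
        load[target["name"]] += test["duration"]
    return plan, load

machines = [
    {"name": "robot-A", "capabilities": {"gripper", "vision"}},
    {"name": "robot-B", "capabilities": {"gripper"}},
]
tests = [
    {"name": "pick-test", "duration": 30.0, "requires": "gripper"},
    {"name": "cam-test", "duration": 20.0, "requires": "vision"},
    {"name": "grip-test", "duration": 10.0, "requires": "gripper"},
]
plan, load = schedule(tests, machines)
print(plan)
```

A real constraint model would additionally encode test dependencies, time windows, and resource limits, and let a solver search for an optimal schedule rather than committing greedily.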
Acquiring Qualitative Explainable Graphs for Automated Driving Scene Interpretation
The future of automated driving (AD) is rooted in the development of robust,
fair and explainable artificial intelligence methods. Upon request, automated
vehicles must be able to explain their decisions to the driver and the car
passengers, to pedestrians and other vulnerable road users, and potentially
to external auditors in case of accidents. However, nowadays, most explainable
methods still rely on quantitative analysis of the AD scene representations
captured by multiple sensors. This paper proposes a novel representation of AD
scenes, called the Qualitative eXplainable Graph (QXG), dedicated to
qualitative spatiotemporal reasoning over long-term scenes. The construction
of this graph exploits the recent Qualitative Constraint Acquisition paradigm.
Our experimental results on nuScenes, an open real-world multi-modal dataset,
show that the qualitative explainable graph of an AD scene composed of 40
frames can be computed in real time with a light storage footprint, which
makes it a potentially interesting tool for improved and more trustworthy
perception and control processes in AD.
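The core data structure can be pictured as follows: for every pair of tracked objects in every frame, the graph stores a qualitative spatial relation instead of raw coordinates. The sketch below is a hypothetical illustration with an invented two-relation vocabulary; the actual QXG construction relies on qualitative constraint acquisition over richer spatiotemporal calculi.

```python
# Hypothetical sketch of a qualitative scene graph: edges carry symbolic
# spatial relations between object pairs, one edge per (frame, pair).

def qualitative_relation(a, b):
    # Invented vocabulary: lateral and longitudinal relations of b w.r.t. a.
    dx = b["x"] - a["x"]
    dy = b["y"] - a["y"]
    lateral = "right-of" if dx > 0 else "left-of"
    longitudinal = "ahead-of" if dy > 0 else "behind"
    return (lateral, longitudinal)

def build_qxg(frames):
    # Key: (frame index, id of a, id of b); value: qualitative relation.
    graph = {}
    for t, objects in enumerate(frames):
        for i, a in enumerate(objects):
            for b in objects[i + 1:]:
                graph[(t, a["id"], b["id"])] = qualitative_relation(a, b)
    return graph

frames = [
    [{"id": "ego", "x": 0.0, "y": 0.0}, {"id": "car1", "x": 2.0, "y": 5.0}],
    [{"id": "ego", "x": 0.0, "y": 1.0}, {"id": "car1", "x": -1.0, "y": 6.0}],
]
qxg = build_qxg(frames)
print(qxg[(0, "ego", "car1")])  # car1 starts to the right of and ahead of ego
```

Because the graph stores a handful of symbols per object pair rather than dense sensor data, it stays small as the scene grows, which is what makes real-time construction and light storage plausible.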
Detecting Intentional AIS Shutdown in Open Sea Maritime Surveillance Using Self-Supervised Deep Learning
In maritime traffic surveillance, detecting illegal activities, such as
illegal fishing or transshipment of illicit products, is a crucial task of the
coastal administration. In the open sea, one has to rely on Automatic
Identification System (AIS) messages transmitted by on-board transponders and
captured by surveillance satellites. However, deceptive vessels often
intentionally shut down their AIS transponders to hide illegal activities. In
the open sea, it is very challenging to differentiate intentional AIS
shutdowns from missing reception due to protocol limitations, bad weather
conditions, or restricting satellite positions. This paper presents a novel
approach for the detection of abnormal missing AIS receptions based on
self-supervised deep learning techniques and transformer models. Using
historical data, the trained model predicts whether a message should be
received in the upcoming minute. The model then reports detected anomalies by
comparing its prediction with what actually happens. Our method can process
AIS messages in real time, handling more than 500 million AIS messages per
month, corresponding to the trajectories of more than 60,000 ships. The method
is evaluated on one year of real-world data from four Norwegian surveillance
satellites. Using related research results, we validated our method by
rediscovering already detected intentional AIS shutdowns.
Comment: IEEE Transactions on Intelligent Transportation Systems
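The anomaly-reporting step described above, comparing what the model predicts with what is actually received, can be sketched as follows. The predictor itself is stubbed out here as a list of precomputed probabilities (the paper uses a self-supervised transformer), and the threshold and minimum gap length are invented for illustration.

```python
# Hypothetical sketch of flagging suspected intentional AIS shutdowns:
# minutes where the model confidently expects a message but none arrives are
# grouped into gaps, and sufficiently long gaps are reported as anomalies.

def flag_shutdowns(predicted, received, threshold=0.8, min_gap=3):
    # predicted[i]: model probability a message arrives in minute i
    # received[i]:  whether a message actually arrived in minute i
    anomalies, gap_start = [], None
    for i, (p, r) in enumerate(zip(predicted, received)):
        missing = p >= threshold and not r
        if missing and gap_start is None:
            gap_start = i          # a confident-but-missing gap begins
        elif not missing and gap_start is not None:
            if i - gap_start >= min_gap:
                anomalies.append((gap_start, i))
            gap_start = None
    if gap_start is not None and len(predicted) - gap_start >= min_gap:
        anomalies.append((gap_start, len(predicted)))
    return anomalies

predicted = [0.9, 0.9, 0.95, 0.9, 0.9, 0.2, 0.9]
received = [True, False, False, False, True, False, True]
print(flag_shutdowns(predicted, received))  # [(1, 4)]
```

Note that minute 5 is not flagged even though no message arrived: the model itself expected poor reception there (probability 0.2), which is exactly the distinction between a protocol- or weather-related miss and a suspicious gap.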
Opening the Software Engineering Toolbox for the Assessment of Trustworthy AI
Trustworthiness is a central requirement for the acceptance and success of human-centered artificial intelligence (AI). To deem an AI system trustworthy, it is crucial to assess its behaviour and characteristics against a gold standard of Trustworthy AI, consisting of guidelines, requirements, or mere expectations. While AI systems are highly complex, their implementations are still based on software. The software engineering community has a long-established toolbox for the assessment of software systems, especially in the context of software testing. In this paper, we argue for the application of software engineering and testing practices for the assessment of trustworthy AI. We make the connection between the seven key requirements defined by the European Commission's AI High-Level Expert Group and established procedures from software engineering, and raise questions for future work.